Deep Relaxation: partial differential equations for optimizing deep neural networks
Abstract
We establish connections between non-convex optimization methods for training deep neural networks (DNNs) and the theory of partial differential equations (PDEs). In particular, we focus on relaxation techniques initially developed in statistical physics, which we show to be solutions of a nonlinear Hamilton-Jacobi-Bellman equation. We employ the underlying stochastic control problem to analyze the geometry of the relaxed energy landscape and its convergence properties, thereby confirming empirical evidence. This paper opens non-convex optimization problems arising in deep learning to ideas from the PDE literature. In particular, we show that the non-viscous Hamilton-Jacobi equation leads to an elegant algorithm based on the Hopf-Lax formula that outperforms state-of-the-art methods. Furthermore, we show that these algorithms scale well in practice and can effectively tackle the high dimensionality of modern neural networks.
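To make the Hopf-Lax formula mentioned in the abstract concrete, here is a minimal sketch. It assumes the non-viscous Hamilton-Jacobi equation takes the form u_t = -(1/2)|∇u|² with initial data u(x, 0) = f(x); the abstract does not state this form. Under that assumption, the Hopf-Lax formula gives u(x, t) = min_y { f(y) + |x - y|² / (2t) }. The function name `hopf_lax`, the 1-D grid, and the toy loss are hypothetical choices for illustration. The brute-force minimisation over a grid is not the paper's algorithm and would not scale to high dimensions.

```python
import numpy as np


def hopf_lax(f_vals, grid, t):
    """Evaluate u(x, t) = min_y { f(y) + (x - y)^2 / (2 t) } on a 1-D grid.

    Brute-force illustration only: every grid point x is compared
    against every candidate y on the same grid.
    """
    diff = grid[:, None] - grid[None, :]  # diff[i, j] = x_i - y_j
    return np.min(f_vals[None, :] + diff**2 / (2.0 * t), axis=1)


# Toy non-convex "loss" (an assumption for illustration, not from the paper).
grid = np.linspace(-3.0, 3.0, 601)
f_vals = 0.1 * grid**2 + np.sin(5.0 * grid)

# Relaxed landscape at time t = 0.5.
u_vals = hopf_lax(f_vals, grid, t=0.5)
```

In this sketch the relaxed values never exceed the original ones, because choosing y = x recovers f(x), and the minimum value over the grid is unchanged. This matches the intuition that relaxation smooths the landscape without moving its global minimum value.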
Similar resources
A unified deep artificial neural network approach to partial differential equations in complex geometries
We use deep feedforward artificial neural networks to approximate solutions of partial differential equations of advection and diffusion type in complex geometries. We derive analytical expressions of the gradients of the cost function with respect to the network parameters, as well as the gradient of the network itself with respect to the input, for arbitrarily deep networks. The method is bas...
Physics Informed Deep Learning (Part II): Data-driven Discovery of Nonlinear Partial Differential Equations
We introduce physics informed neural networks – neural networks that are trained to solve supervised learning tasks while respecting any given law of physics described by general nonlinear partial differential equations. In this second part of our two-part treatise, we focus on the problem of data-driven discovery of partial differential equations. Depending on whether the available data is sca...
Physics Informed Deep Learning (Part I): Data-driven Solutions of Nonlinear Partial Differential Equations
We introduce physics informed neural networks – neural networks that are trained to solve supervised learning tasks while respecting any given law of physics described by general nonlinear partial differential equations. In this two part treatise, we present our developments in the context of solving two main classes of problems: data-driven solution and data-driven discovery of partial differe...
Beyond Finite Layer Neural Networks: Bridging Deep Architectures and Numerical Differential Equations
Deep neural networks have become the state-of-the-art models in numerous machine learning tasks. However, general guidance to network architecture design is still missing. In our work, we bridge deep neural network design with numerical differential equations. We show that many effective networks, such as ResNet, PolyNet, FractalNet and RevNet, can be interpreted as different numerical discreti...
Integration of Deep Learning Algorithms and Bilateral Filters with the Purpose of Building Extraction from Mono Optical Aerial Imagery
Extracting buildings from mono optical aerial imagery with high spatial resolution has long been considered an important challenge in map preparation. The goal of the current research is to use semantic segmentation of mono optical aerial imagery to extract buildings, based on the combination of deep convolutional neural networks (DCNN) an...
Journal title:
- CoRR
Volume: abs/1704.04932
Publication date: 2017